Coding challenge for Climate Farmers
I chose to work with Portugal because it includes various types of land cover and shows variability in soil organic carbon and climate. I imported the country shape from the GADM database. As I want to work with mainland Portugal only, I excluded Madeira and the Azores.
Shape of mainland Portugal used to restrict the analysis of soil, landcover and climate data.
Temperature (°C) in Portugal for January 2020.
Evapotranspiration (m) in Portugal for January 2020.
Precipitation (m) in Portugal for January 2020.
Land cover classes in Portugal in 2020.
The label and color information was extracted from the metadata of the land cover layer. However, I decided to change the color for the three cropland rainfed classes slightly, so they could be distinguished better.
Soil organic carbon content (t/ha) in Portugal.
| Layer Name | Resolution (lon) | Resolution (lat) | Origin (lon) | Origin (lat) |
|---|---|---|---|---|
| Climate variables | 0.1 | 0.1 | -9.58 | 36.97 |
| Land Cover | 0.0028 | 0.0028 | -9.5472 | 36.9611 |
| Soil Organic Carbon | 0.0023 | 0.0024 | -9.5468 | 36.9609 |
The climate variables (which share the same spatial dimensions) are at a lower resolution than the land cover and SOC layers and also have a different origin. Therefore I resampled the latter two layers to the same resolution, origin and extent as the climate variables.
Land cover is a categorical variable. I explored two options to resample these data. The first is the “nearest neighbor” method, which is typically used for categorical variables: it assigns each new pixel the value of the nearest original pixel without any interpolation. However, when resampling to a lower resolution, it can be more meaningful to assign the class that covers most of the high-resolution pixels within each low-resolution cell. I tested that option as well and compared the maps visually in terms of pattern.
# resample option 1 using nearest neighbor
landcover_pt_near <- terra::resample(landcover_pt, resample_raster, method = "near")
# resample option 2 using majority
landcover_pt_majority <- exactextractr::exact_resample(
  landcover_pt,
  resample_raster,
  "majority"
)
Land cover classes in Portugal in 2020 resampled to climate data with two different methods.
When resampling with method “near”, the land cover classes are very patchy and fragmented. The layer resampled with “majority” has a more continuous representation of classes, and the logic of that resampling method is more sound, so I am using that layer for the analysis.
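The difference between the two methods can be illustrated on a toy grid. This is a minimal base R sketch, not the `terra` or `exactextractr` implementation: the 6 x 6 matrix `fine` and the aggregation factor of 3 are made up for illustration, “nearest neighbor” is approximated by taking the centre cell of each 3 x 3 block, and partial cell coverage is ignored.

```r
# Hypothetical 6 x 6 categorical raster (classes 1 to 3)
fine <- matrix(c(
  1, 1, 1, 2, 2, 2,
  1, 3, 1, 2, 1, 2,
  1, 1, 1, 2, 2, 2,
  3, 3, 2, 1, 1, 3,
  3, 2, 3, 1, 3, 3,
  3, 3, 2, 3, 3, 1
), nrow = 6, byrow = TRUE)

fact <- 3  # each coarse cell covers a 3 x 3 block of fine cells

# Most frequent class in a block (ties resolved by the first class in sort order)
majority <- function(v) as.numeric(names(which.max(table(v))))

coarse_majority <- matrix(NA_real_, nrow = 2, ncol = 2)
coarse_near <- matrix(NA_real_, nrow = 2, ncol = 2)

for (i in 1:2) {
  for (j in 1:2) {
    rows <- ((i - 1) * fact + 1):(i * fact)
    cols <- ((j - 1) * fact + 1):(j * fact)
    block <- fine[rows, cols]
    coarse_majority[i, j] <- majority(block)
    coarse_near[i, j] <- block[2, 2]  # fine cell closest to the coarse cell centre
  }
}

coarse_majority
coarse_near
```

In this toy case the centre cell is an outlier in three of the four blocks, so the nearest-neighbor result differs from the dominant class, which is the kind of patchiness described above.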
Soil organic carbon (SOC) is a continuous variable. I want to work
with the mean SOC per lower-resolution cell. Typically, continuous
rasters are resampled using the bilinear method, which calculates the
value at a grid location as a weighted average of the four nearest
cell centers. I tested this option, as well as using the “mean”
function within the exact_resample() function used for land cover.
This function aggregates cells before resampling, so that the average
is not based on four grid cells but on all grid cells covered by the
lower-resolution cell. I compared the maps visually in terms of
pattern.
# resample option 1: bilinear interpolation
SOC_pt_bil <- terra::resample(SOC_pt, resample_raster, method = "bilinear")
# resample option 2: coverage-weighted mean of all cells within each target cell
SOC_pt_mean <- exactextractr::exact_resample(
  SOC_pt,
  resample_raster,
  "mean"
)
Soil organic carbon layer in Portugal resampled with two different methods.
When resampling with `exact_resample()` and the function “mean”, the pattern of low SOC values along the coast and high values in the north of Portugal is maintained, so I am going to use that layer for the analysis.
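Why the two methods can give different values is easiest to see for a single coarse cell. The sketch below is a simplified base R illustration with made-up SOC values: it assumes the coarse cell covers exactly a 4 x 4 block of fine cells and that bilinear interpolation is evaluated at the coarse cell centre, where the four nearest fine cell centres are equidistant and so get equal weights.

```r
# Hypothetical SOC values (t/ha) of the 16 fine cells inside one coarse cell
block <- matrix(c(
  10, 10, 10, 10,
  10, 50, 50, 10,
  10, 50, 50, 10,
  10, 10, 10, 10
), nrow = 4, byrow = TRUE)

# Bilinear at the block centre: equal-weight average of the four central cells
bilinear_centre <- mean(block[2:3, 2:3])

# Block mean: average over all 16 fine cells covered by the coarse cell
block_mean <- mean(block)

bilinear_centre  # 50
block_mean       # 20
```

Here the bilinear value reflects only the local hotspot at the centre, while the block mean represents the whole area of the coarse cell.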
I can then check whether the dimensions of all layers match before proceeding with the analysis, using the `compareGeom()` function.
compareGeom(evapotransp_pt, precipitation_pt, temperature_pt,
landcover_pt, SOC_pt)
Now I can analyse climate and soil organic carbon within the land cover classes in Portugal. First I want to explore what the share of each landcover class is.
Land cover classes of Portugal and their proportions. All classes with bars below the dashed line were excluded, as well as water and NA.
For the next steps I excluded the land cover classes with less than 1% of all pixels (i.e. 10 pixels or fewer), as well as pixels with land cover “water” or “NA”, which made up 2.5% and 1% of all pixels, respectively.
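This filtering step can be sketched in base R as follows. The vector `landcover` and its class counts are hypothetical stand-ins for the per-pixel land cover values, and the 1% threshold follows the text above.

```r
# Hypothetical land cover value per pixel (1000 pixels in total)
landcover <- c(rep("Tree cover", 500), rep("Cropland", 300), rep("Shrubland", 150),
               rep("Water", 25), rep("Urban", 5), rep(NA, 20))

# Share of each class, counting NA as its own category
share <- prop.table(table(landcover, useNA = "ifany"))
class_names <- names(share)

# Keep classes with at least 1% of pixels, dropping water and NA
keep <- class_names[as.vector(share) >= 0.01 &
                      !is.na(class_names) &
                      !(class_names %in% "Water")]

landcover_kept <- landcover[landcover %in% keep]

keep
length(landcover_kept)
```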
We can look at climate variables per land cover class over time. Depending on what needs to be communicated, plots can highlight different parts. I have first looked at how the mean values change over the months from 2020 to 2022 for all landcover classes together. This is a nice way to show where certain landcover classes behave differently than others.
Mean temperature in different land cover classes in Portugal from 2020 to 2022.
Mean precipitation in different land cover classes in Portugal from 2020 to 2022.
Mean evapotranspiration in different land cover classes in Portugal from 2020 to 2022.
Depending on what the focus of the visualization should be, I could also look at each land cover class separately. This visualization is better suited to comparing overall differences in the individual patterns, and it allows plotting the standard deviation as an error band around each line, which is hard to see in the combined plot. I am showing temperature here as an example.
Mean temperature and standard deviation per land cover class in Portugal from 2020 to 2022.
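The summary behind such plots is a mean and standard deviation per land cover class and month. A minimal base R sketch, assuming a long-format data frame; `climate_df`, its column names and its values are hypothetical and not taken from the analysis:

```r
set.seed(42)

# Hypothetical long-format data: one row per pixel and month
climate_df <- expand.grid(
  pixel = 1:20,
  month = c("2020-01", "2020-02", "2020-03"),
  stringsAsFactors = FALSE
)
climate_df$landcover <- ifelse(climate_df$pixel <= 10, "Cropland", "Tree cover")
climate_df$temperature <- rnorm(nrow(climate_df), mean = 12, sd = 2)

# Mean and standard deviation per land cover class and month
summary_df <- aggregate(
  temperature ~ landcover + month,
  data = climate_df,
  FUN = function(x) c(mean = mean(x), sd = sd(x))
)

# Flatten the matrix column into temperature.mean and temperature.sd
summary_df <- do.call(data.frame, summary_df)
summary_df
```

The resulting `temperature.mean` and `temperature.sd` columns can then be mapped to the line and its error band in the plotting step.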
I can visualize mean soil organic carbon per landcover class. As I only have one time point for this layer I used a bar plot for visualisation.
Average soil organic carbon (t/ha) per land cover classes of Portugal.
Error bars show the standard deviation around the mean per class.
The necessary sample size is calculated following:
\[ n = \left(\frac{z \times \sigma}{E}\right)^2 \]
This gives the number of samples needed to detect changes in SOC if I want a 95% confidence interval equal to or less than 10% of the mean value, assuming a Gaussian distribution. The formula calculates the required sample size (n) needed to estimate a population mean within a desired margin of error (E) at a specified confidence level. It considers the variability of the population (σ) and the critical value from the standard normal distribution (z).
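As a worked example with purely hypothetical numbers (not taken from the Portugal data): for a mean SOC of 60 t/ha, a standard deviation of σ = 30 t/ha, a margin of error of E = 0.1 × 60 = 6 t/ha, and z ≈ 1.96 for a 95% confidence level,

\[ n = \left(\frac{1.96 \times 30}{6}\right)^2 = 9.8^2 \approx 96.04 \]

so at least 97 samples would be needed after rounding up.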
The sample size can thus be calculated with the following code:
# calculate variables for sample size
confidence_level <- 0.95
z_score <- qnorm(1 - (1 - confidence_level) / 2)
# calculate standard deviation of soil class
standard_deviation <- sd(sampling_env$SOC_df$SOC)
desired_width <- 0.1 * mean(sampling_env$SOC_df$SOC)
# Calculate the number of samples required
nr_samples <- (z_score * standard_deviation / desired_width) ^ 2
Working with the land cover at the resampled ~0.1° resolution, the necessary sample size is thus NA. This calculated value represents the minimum sample size needed for the analysis.
Working with the landcover at the original 250 m resolution and thus a higher variability in the data, the necessary sample size would be 40.
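An NA result is what `sd()` and `mean()` return when their input contains missing values, which is common for data frames derived from masked rasters. Whether that is the cause of the NA reported for the resampled data is an assumption; the sketch below uses a hypothetical SOC vector to show the behaviour and how `na.rm = TRUE` avoids it.

```r
# Hypothetical SOC values (t/ha) with missing entries
soc <- c(42.1, 55.3, NA, 61.8, 38.4, 70.2, NA, 49.9)

confidence_level <- 0.95
z_score <- qnorm(1 - (1 - confidence_level) / 2)

# Without na.rm, a single NA propagates through sd() and mean()
sd(soc)

# With na.rm = TRUE, missing values are dropped before the calculation
standard_deviation <- sd(soc, na.rm = TRUE)
desired_width <- 0.1 * mean(soc, na.rm = TRUE)

# Round up, because a sample size must be a whole number
nr_samples <- ceiling((z_score * standard_deviation / desired_width)^2)
nr_samples
```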
If we wanted to sample these points in space we could use the
spatSample() function to suggest random coordinates within
Portugal.
Random sampling scheme for the soil organic carbon content (t/ha) in Portugal.
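A random sampling scheme of this kind can be sketched with `terra::spatSample()`. The raster below is a hypothetical stand-in for the SOC layer (random values on a 20 x 20 grid, with the first half of the cells set to NA to mimic cells outside the country), and `n_samples` is an arbitrary example value rather than the sample size calculated above.

```r
library(terra)

set.seed(123)

# Hypothetical raster roughly spanning the bounding box of mainland Portugal
soc <- rast(nrows = 20, ncols = 20, xmin = -9.6, xmax = -6.1,
            ymin = 36.9, ymax = 42.2, crs = "EPSG:4326")
vals <- runif(ncell(soc), min = 20, max = 120)  # made-up SOC values (t/ha)
vals[1:200] <- NA                               # stand-in for cells outside the country
values(soc) <- vals
names(soc) <- "SOC"

n_samples <- 10

# Draw random cells, skipping NA cells, and return their coordinates and values
sample_points <- spatSample(soc, size = n_samples, method = "random",
                            na.rm = TRUE, xy = TRUE)
head(sample_points)
```

With `na.rm = TRUE`, only cells with a value are returned, so the suggested coordinates fall inside the area covered by the SOC layer.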
We can also design a sampling scheme for soil organic carbon content that is stratified for land cover classes. In this case we calculate the necessary sample size for the SOC values of all pixels with that landcover class.
Stratified sampling scheme for the soil organic carbon content (t/ha) within land cover classes in Portugal.
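The per-class sample size for a stratified scheme can be sketched by applying the same formula within each land cover class. This is a minimal base R illustration: `soc_df`, its class names and its simulated values are hypothetical, and `required_n()` is a helper introduced here only for the example.

```r
set.seed(1)

# Hypothetical SOC values (t/ha) per pixel with their land cover class
soc_df <- data.frame(
  landcover = rep(c("Cropland", "Shrubland", "Tree cover"), times = c(40, 30, 60)),
  SOC = c(rnorm(40, mean = 45, sd = 10),
          rnorm(30, mean = 60, sd = 15),
          rnorm(60, mean = 80, sd = 25))
)

# Sample size for a margin of error of rel_error * mean at the given confidence level
required_n <- function(x, confidence_level = 0.95, rel_error = 0.1) {
  x <- x[!is.na(x)]
  z_score <- qnorm(1 - (1 - confidence_level) / 2)
  ceiling((z_score * sd(x) / (rel_error * mean(x)))^2)
}

# One required sample size per land cover class
n_per_class <- tapply(soc_df$SOC, soc_df$landcover, required_n)
n_per_class
```

Classes with a high standard deviation relative to their mean need more samples, so the stratified design allocates effort where SOC is most variable.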
I implemented a very simple soil model using the RothC model from the package ‘SoilR’. The setup assumes that only the following information is available: the percent clay content in the topsoil, which I extracted for a point within Portugal from the SoilGrids database, an assumed annual amount of litter inputs, and monthly averages of climatic variables for that same point. The model is run 300 years into the future.
Output of simple soil model using RothC.
The final pool sizes of Decomposable Plant Material (DPM), Resistant Plant Material (RPM), Microbial Biomass (BIO), Humified Organic Matter (HUM), and Inert Organic Matter (IOM) for this point in Portugal with the assumed parameters are then:
| DPM | RPM | BIO | HUM | IOM |
|---|---|---|---|---|
| 0.1477609 | 2.1369862 | 0.2786526 | 11.4037173 | 5.4357393 |